

Poster: PDpolice Date: Aug 31, 2012 1:37am
Forum: web Subject: Re: robots.txt

I suggest you first search the IA forums for "robots.txt". The issue has been commented on several times. Then lay out an argument for the proper treatment of a web page which has had ownership changes. Done properly, this could be the foundation for further debates and possible policy changes at the Archive.

Please keep in mind that the open nature of the forum does allow comments by those who do not have the Internet Archive's best interests at heart.


Poster: Not Allan Date: Aug 31, 2012 8:14am
Forum: web Subject: Re: robots.txt

The archive is a historical record of the web. If a site is scanned on 01/01/2000 then that is the record.

If the ownership of the site changes, what does that have to do with the historical record? Answer - nothing.

Right?

You say

"Then lay out an argument for the proper treatment of a web page which has had ownership changes."

I think I've done so: a change in ownership should not give the new owner the right to change the historical record. The proper treatment is to leave the historical record as it is.

I don't really think this is debatable, do you?

This post was modified by Not Allan on 2012-08-31 15:14:00


Poster: jory2 Date: Aug 31, 2012 8:38am
Forum: web Subject: Re: robots.txt

@Not Allan:
You say: "If the ownership of the site changes, what does that have to do with the historical record? Answer - nothing."
I don't think anyone would disagree with that.
What you failed to consider, and what is most important, is what the rightful owner(s) of the websites want.
Perhaps the "proper treatment" would have been for this website to have the consent and the legal permission of the rightful owners in the first place.


Poster: Not Allan Date: Aug 31, 2012 8:49am
Forum: web Subject: Re: robots.txt

If the archive finds itself publishing material that is copyrighted, and the person owning the copyright wants to challenge the archive, then they have the normal means of doing so. But this is not the issue here. There are no copyright challenges involved.

The issue here is historical integrity.


Poster: jory2 Date: Aug 31, 2012 9:41am
Forum: web Subject: Re: robots.txt

@Not Allan:
http://archive.org/about/terms.php
"The Archive does not endorse or sponsor any content in the Collections, nor does it guarantee or warrant that the content available in the Collections is accurate, complete, noninfringing, or legally accessible in your jurisdiction, and you agree that you are solely responsible for abiding by all laws and regulations that may be applicable to the viewing of the content. In addition, the Collections are provided to you on an as-is and as-available basis."

I agree 100% with the statement "the issue here is historical integrity." Unfortunately, this website has absolutely none.

You say: "If the archive finds itself publishing material that is copyrighted"
If? What do you mean, if? I know of zero websites old enough to be in the public domain and free of copyright.
Do you?



Poster: Not Allan Date: Aug 31, 2012 10:42am
Forum: web Subject: Re: robots.txt

'I agree 100% with the statement "the issue here is historical integrity." unfortunately this website has absolutely none.'

I don't understand this. Why not? Isn't the purpose of this website to archive the internet? So why not just do it, with the disclaimers in the terms you have noted?

Also, the 'new owners' are not owners of the earlier web site; they are the new owners of the web site name, that's all. I don't know their motives for blacking out the site with robots.txt, but I suspect that they are not honorable. Why should they be catered to?
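
For anyone unfamiliar with the mechanism, a robots.txt block is only a short text file served from the root of the domain, so whoever controls the name today controls the file. The following is a rough sketch of how such a rule is interpreted, using Python's standard `urllib.robotparser`. The user agent `ia_archiver` and the `example.com` URL are assumptions for illustration; the thread does not say what the actual file on any site contains.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a new domain owner might publish.
# "ia_archiver" as the archive crawler's name is an assumption here.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Placeholder URL; not a site discussed in this thread.
PAGE = "http://example.com/page.html"

blocked_for_archive = is_blocked(ROBOTS_TXT, "ia_archiver", PAGE)
blocked_for_others = is_blocked(ROBOTS_TXT, "SomeOtherBot", PAGE)
```

Note that nothing in the file says when it was written or by whom, which is why a rule published today can end up hiding captures made years before the name changed hands.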


Poster: PDpolice Date: Aug 31, 2012 2:12pm
Forum: web Subject: Re: robots.txt

In my first post I asked you to "Please keep in mind the open nature of the forum does allow comments by those who do not have the Internet Archive's best interests at heart." I could have listed the names of some of those you have now met. Some posters are similar to a 'grass-roots' organization funded by millionaires. Not everyone wants information to be available without constant payment to a corporation or group. Do not equate the postings with the Archive itself.


Please lay out a more thorough argument for the proper treatment of a web page which has had ownership changes. And if you can address the problem of personal information being archived it would help.


Poster: Not Allan Date: Aug 31, 2012 9:07pm
Forum: web Subject: Re: robots.txt

As pointed out, it is only the ownership of the name that changes. The new owner (unless by agreement with the prior owner) has no access to the content of the earlier web site; that is, he has no access to the files that made up the earlier site.

Therefore the 'new owner' of the name has no claim whatever on the prior web site, and no responsibility for it, and thus should have no say regarding it.

I can't see any counter argument here, can you?

But, the case is even stronger I think. Suppose I own a website, and I now decide that I'd like to have prior incarnations of the site erased from the archive. Should I have that prerogative? I don't think so. Ownership has nothing to do with it that I can see.

As for personal information, I don't see any difference whether it's posted on the active web or archived. The reasons for removing it from the web would apply equally to removing it from the archive, and thus the question regarding the archive is identical to the question for the active web.

After all, the archive is part of the active web.

I suppose if I am the owner of a site (past and present) and I want to remove personal info, then the archive should consider that on a case-by-case basis, and the removal should be explicitly noted on the web page (the removal should not be hidden and invisible).

This post was modified by Not Allan on 2012-08-31 21:46:57

This post was modified by Not Allan on 2012-08-31 21:49:36

This post was modified by Not Allan on 2012-08-31 21:50:24

This post was modified by Not Allan on 2012-09-01 04:07:05


Poster: daffaela Date: Nov 15, 2013 11:44pm
Forum: web Subject: Re: robots.txt

No kidding. You used to be able to batch upload all the shows you wanted to submit to one host (no interaction thanks to .netrc) and then go to the contribution page after they were finished uploading. How long is the old upload ftp site going to be in operation?
Great info Thanks



Poster: jory2 Date: Aug 31, 2012 10:55am
Forum: web Subject: Re: robots.txt

@ Not Allan:
I don't think you understand even the basics of copyright law, so there's no point in this discussion; for me, anyway.
Take care!


Poster: andy forester Date: Feb 10, 2013 1:07pm
Forum: web Subject: Re: robots.txt

I totally agree. It's the copyright law; you can ask any lawyer around. My blog - copyrighted?


Poster: Not Allan Date: Aug 31, 2012 11:47am
Forum: web Subject: Re: robots.txt

I have now read through many of the earlier posts on the subject.

As I pointed out (below), the ownership of the web page has not changed; the ownership of the web site name has changed. The new owner has no access at all to the content of the old web site, nor any responsibility for it. So why should ownership of the web site name allow the new owner to erase the old web site from the historical record?

So, reiterating, I think the proper policy is to maintain an accurate historical record.


This post was modified by Not Allan on 2012-08-31 18:47:06


Poster: speederk Date: Oct 30, 2012 5:42pm
Forum: web Subject: Re: robots.txt

Here is another issue I would like to comment on.

I am trying to do a little research on an electronics company (Harmon Electronics, harmonind.com). They HAD a web page, but the company was bought out by GE and thus changed names.

When their domain expired, a web squatter took the name over and placed a robots.txt block on the same address.

Now the electronics company's archive is not available.

To keep the IA motto of "Universal access to all knowledge" accurate, it would be in the best interests of IA to go through the databases, find out when each robots.txt file was put in place, and allow access to prior work... if it still exists.
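
One way to picture that proposal is a date cutoff: record when a blocking robots.txt was first observed for a domain and hide only captures made on or after that date. The sketch below is only an illustration of the idea. The `Capture` type, the `visible_captures` function, and all dates are hypothetical, and none of this describes how the Archive actually stores or filters captures.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Capture:
    """One archived snapshot of a page (hypothetical record type)."""
    url: str
    captured_on: date


def visible_captures(
    captures: Iterable[Capture],
    block_first_seen: Optional[date],
) -> List[Capture]:
    """Keep captures made before a blocking robots.txt first appeared.

    If no block was ever observed, every capture stays visible.
    """
    if block_first_seen is None:
        return list(captures)
    return [c for c in captures if c.captured_on < block_first_seen]


# Invented example data on a placeholder domain.
history = [
    Capture("http://example.com/", date(2000, 1, 1)),
    Capture("http://example.com/", date(2005, 6, 15)),
    Capture("http://example.com/", date(2012, 3, 1)),
]

# Suppose the block was first seen after the name changed hands.
still_visible = visible_captures(history, date(2011, 1, 1))
```

Under this rule the two earlier snapshots remain accessible and only the one taken after the block is hidden. The hard part in practice would be knowing the date reliably, since it depends on how often robots.txt itself was captured.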

This post was modified by speederk on 2012-10-31 00:42:45
